Database Dependency Discovery: A Machine Learning Approach

نویسندگان

  • Peter A. Flach
  • Iztok Savnik
چکیده

Database dependencies, such as functional and multivalued dependencies, express the presence of structure in database relations, that can be utilised in the database design process. The discovery of database dependencies can be viewed as an induction problem, in which general rules (dependencies) are obtained from specific facts (the relation). This viewpoint has the advantage of abstracting away as much as possible from the particulars of the dependencies. The algorithms in this paper are designed such that they can easily be generalised to other kinds of dependencies. Like in current approaches to computational induction such as inductive logic programming, we distinguish between topdown algorithms and bottom-up algorithms. In a top-down approach, hypotheses are generated in a systematic way and then tested against the given relation. In a bottom-up approach, the relation is inspected in order to see what dependencies it may satisfy or violate. We give algorithms for both approaches.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Discovery of Dependency Tree Patterns for Relation Extraction

Relation extraction is to identify the relations between pairs of named entities. In this paper, we try to solve the problem of relation extraction by discovering dependency tree patterns (a pattern is an embedded sub dependency tree indicating a relation instance). Our approach is to find an optimal rule (pattern) set automatically based on the proposed dependency tree pattern mining algorithm...

متن کامل

Discovery of Data Dependencies in Relational Databases Lss8 Report 14 Discovery of Data Dependencies in Relational Databases Lss8 Report 14

Knowledge discovery in databases is not only the nontrivial extraction of implicit, previously unknown and potentially useful information from databases. We argue that in contrast to machine learning, knowledge discovery in databases should be applied to real world databases. Since real world databases are known to be very large, they raise problems of the access. Therefore, real world database...

متن کامل

A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features

Named Entity Recognition is an information extraction technique that identifies name entities in a text. Three popular methods have been conventionally used namely: rule-based, machine-learning-based and hybrid of them to extract named entities from a text. Machine-learning-based methods have good performance in the Persian language if they are trained with good features. To get good performanc...

متن کامل

Anomaly detection through quasi-functional dependency analysis

Anomaly detection problems have been investigated in several research areas such as database, machine learning, knowledge discovery, and logic programming, with the main goal of identifying objects of a given population whose behavior is anomalous with respect to a set of commonly accepted rules that are part of the knowledge base. In this paper we focus our attention on the analysis of anomaly...

متن کامل

Mining Rare Association Rules by Discovering Quasi- Functional Dependencies: An Incremental Approach

Mining rare events is a critical task in several research areas such as database, machine learning, knowledge discovery, fraud detection, activity monitoring, and image analysis. In all these situations the main goal is to identify objects of a given population whose behavior is anomalous with respect to a set of rules part of the knowledge base, usually expressed by means of some statistical k...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • AI Commun.

دوره 12  شماره 

صفحات  -

تاریخ انتشار 1999